jonatanpallesen
github.com/ymer
jsmp.dk


Jonatan Pallesen

I am a data scientist with a PhD in genomics. I’m curious by nature, and enjoy the challenge of separating signal from noise to gain real insights.

Through my education and work I have acquired extensive theoretical and practical experience with data handling, statistics, machine learning, visualization and algorithms. I am a very skilled programmer in Python, R and Julia, with more than ten years of experience in the first two.

As a statistical analyst and data scientist I have worked on a variety of projects and been in charge of every stage of the analysis, from problem definition, data cleaning and quality control to statistical analysis, model building and presentation.

Work Experience

Data scientist

Raven biosciences

2019

  • Lead data scientist
  • Working on a variety of projects in education, fintech and automated machine learning.

Statistical analyst

Aarhus University

2015 - 2018

  • Programming pipelines and tools, working with very large data sets, statistical analysis and machine learning.


Education

PhD, human genetics

Aarhus University

2011 - 2015

  • Thesis: Association studies of psychiatric disorders: On association of genes, gene sets and runs of homozygosity.

Visiting researcher

University of California, Berkeley

2012

MSc, molecular biology and computer science

Aarhus University

2004 - 2011





Selected public data science projects

I regularly publish new analyses and visualizations on my blog.

How to transform your data

2019

  • Using simulations to determine the optimal transformation for skewed variables

Blind auditions and gender discrimination

2019

  • Re-analysis of a seminal study (used for an article in the Wall Street Journal).

Project Euler

2016

  • Computational problems solved in Python and Julia


Selected Publications

Discovery of the first genome-wide significant risk loci for attention deficit hyperactivity disorder

Nature Genetics.

2019

  • Demontis et al.

Identification of common genetic risk variants for autism spectrum disorder

Nature Genetics.

2019

  • Grove et al.

LandScape: a simple method to aggregate p-values and other stochastic variables without a priori grouping

Statistical Applications in Genetics and Molecular Biology.

2016

  • Joint first author with Carsten Wiuf.